NOTES ON AVOIDING "GO TO" STATEMENTS 


D. E. Knuth and R. W. Floyd 


During the last decade there has been a growing sentiment that the 
use of "go to" statements is undesirable, or actually harmful. This 
attitude is apparently inspired by the idea that programs expressed 
solely in terms of conventional iterative constructions ("for", "while", 
etc.) are more readable and more easily proved correct. In this note 
we will make a few exploratory observations about the use and disuse of 
go to statements, based on two typical programming examples (from 
"symbol table searching" and "backtracking"). 

In the first place let us consider systematic ways for eliminating 
go to statements. There are two apparent ways to achieve this: 

(a) Recursive procedure method. Suppose that each statement of a 
program is labeled. Replace each labeled statement 

L: S 


by 


procedure L; begin S; L' end 

where L' is the static successor of the statement S. A go to statement 
becomes simply a procedure call. The program ends by calling a null 
procedure. This construction shows that the mere elimination of go to 
statements does not automatically make a program better or easier to 
follow; "go to" is in some sense a special case of the procedure calling 
mechanism. (It is instructive in fact to consider this construction in 
reverse, realizing that it is sometimes more efficient to replace 
procedure calls by go to statements!) 

(b) Regular expression method. For convenience, imagine a program 
expressed in flowchart form, as a directed graph. It is well known that 
all paths through this graph can be represented by "regular expressions" 
involving the operations of concatenation, alternation, and "star"; these 
latter correspond to familiar constructions in programming languages 
which do not depend on go to statements. Therefore it appears that 
"go to" statements can be eliminated, although it may be necessary to 
duplicate the code for other statements in several places. This process 
is essentially what John Cocke calls "node splitting". 

Consider, for example, the following well-known programming 
situation: 

for i := 1 step 1 until n do 
    if A[i] = x then go to found; 
not found: n := i; A[i] := x; B[i] := 0; 
found: B[i] := B[i]+1; 

(Let us assume, for convenience, that i = n+1 if the for loop is 
exhausted.) It is not obvious that the go to statement here is all that 
unsightly, but let us suppose that we are reactionary enough that we 
really want to abolish them from programming languages. [See Dijkstra, 
Comm. ACM 11 (1968), 147-148.] One way to avoid the go to is to use a 
recursive procedure: 
procedure find; 
    if i > n then begin n := i; A[i] := x; B[i] := 0 end 
    else if A[i] ≠ x then begin i := i+1; find end; 
i := 1; find; B[i] := B[i]+1; 

An optimizing compiler could perhaps produce the same code for both 
programs, but again it is debatable which program is more readable and 
simpler. 

Other solutions change the structure of the program slightly: 

(a) i := 1; 
    while i ≤ n and A[i] ≠ x do i := i+1; 
    if i > n then begin n := i; A[i] := x; B[i] := 0 end; 
    B[i] := B[i]+1; 

(b) i := 1; 
    while A[i] ≠ x do 
        begin i := i+1; 
            if i > n then begin n := i; A[i] := x; B[i] := 0 end 
        end; 
    B[i] := B[i]+1; 

Solution (b) assumes that n > 0. Both solutions increase the amount of 
calculation that is specified: (a) tests "i > n" twice, while (b) 
tests "A[i] ≠ x" after n has been increased. 
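
The two solutions can be compared by running them. The following is a minimal Python sketch, not part of the original note. The helper names store_new and tally are hypothetical, the tables are Python lists kept 1-based by leaving index 0 unused, and the table is seeded with one entry so that the assumption n > 0 of solution (b) holds.

```python
def store_new(A, B, i, x):
    """Put x in slot i with count 0, growing the 1-based tables if needed."""
    while len(A) <= i:
        A.append(None)
        B.append(0)
    A[i] = x
    B[i] = 0


def search_a(A, B, n, x):
    """Solution (a): the test i > n is made again after the loop."""
    i = 1
    # Python's `and` short-circuits, so A[n+1] is never read here.
    while i <= n and A[i] != x:
        i += 1
    if i > n:
        n = i
        store_new(A, B, i, x)
    B[i] += 1
    return n


def search_b(A, B, n, x):
    """Solution (b): assumes n > 0; A[i] != x is re-tested after an insertion."""
    i = 1
    while A[i] != x:
        i += 1
        if i > n:
            n = i
            store_new(A, B, i, x)
    B[i] += 1
    return n


def tally(search, words):
    """Count each word, seeding the table with words[0] so that n > 0."""
    A, B, n = [None, words[0]], [0, 1], 1   # index 0 is unused padding
    for w in words[1:]:
        n = search(A, B, n, w)
    return dict(zip(A[1:n + 1], B[1:n + 1]))
```

Calling tally(search_a, words) and tally(search_b, words) on the same list should give the same counts; the two functions differ only in which test is repeated.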

The flowchart of the original program is: 
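[Flowchart (*); its edges are: 
START -> $\sigma_1$ -> $\tau_1$; 
$\tau_1$ YES -> $\sigma_3$ -> $\sigma_4$ -> STOP; 
$\tau_1$ NO -> $\tau_2$; 
$\tau_2$ YES -> $\sigma_4$; 
$\tau_2$ NO -> $\sigma_2$ -> $\tau_1$.] 

$\sigma_1$ = i := 1 
$\tau_1$ = i > n ? 
$\tau_2$ = A[i] = x ? 
$\sigma_2$ = i := i+1 
$\sigma_3$ = n := i; A[i] := x; B[i] := 0 
$\sigma_4$ = B[i] := B[i]+1 

(*) 
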

By a suitable extension of BNF we can write a grammar for all 
flowcharts producible by a language without procedure calls or go to 
statements: 


<program> ::= START -> <statement> -> STOP 

[The productions for <statement>, <basic statement>, <conditional 
statement>, and <iterative statement> are flowchart diagrams and are 
not recoverable here. They build a statement from basic statements in 
sequence, from conditional statements, and from iterative statements.] 

Here $\sigma$ denotes a "statement" and $\tau$ denotes a "test". 

We have not completely analyzed this grammar, although it appears to 
be unambiguous; there is probably an efficient parsing algorithm which 
will decide whether or not a given flowchart is derivable from the 
grammar, constructing a derivation when one exists. But we can easily 
prove that the above flowchart is not producible by this grammar. In fact, 
a stronger result is true: 


Theorem. No flowchart producible by the above grammar specifies 
precisely the computations of the above example flowchart (*). 

This theorem contradicts our observations above about regular 
expressions being reducible to concatenation, alternation, and iteration; 
for our flowcharts provide each of these operations, yet they cannot 
reproduce the computations in (*). What went wrong? Perhaps it is 
that regular expressions are nondeterministic, while computations are 
inherently deterministic; but no, it is well known that regular expressions 
may be considered to be deterministic. The difference really lies in 
the nature of computational tests. 

Thus, let us consider a special class $\mathcal{R}$ of regular expressions; 
$\mathcal{R}$ describes all computational sequences (paths in the flowchart) 
producible by flowcharts corresponding to a language without go-to 
statements: 

the empty sequence is in $\mathcal{R}$. 

$\sigma \in \mathcal{R}$, for all statements $\sigma$. 

$R_1 R_2 \in \mathcal{R}$, for all $R_1$ and $R_2 \in \mathcal{R}$. 

$(\tau_Y R_1 \mid \tau_N R_2) \in \mathcal{R}$, for all $R_1$ and $R_2 \in \mathcal{R}$ and all tests $\tau$. 

$(\tau_Y R_1)^* \tau_N \in \mathcal{R}$ and $(\tau_N R_1)^* \tau_Y \in \mathcal{R}$, for all $R_1 \in \mathcal{R}$ and all tests $\tau$. 

Here the subscripts Y and N denote the "YES" or "NO" branches in 
the flowchart. 

To prove the theorem, consider the computational sequences producible 
by the flowchart (*); they may be described by the regular expression 

$\sigma_1 (\tau_{1N} \tau_{2N} \sigma_2)^* (\tau_{1Y} \sigma_3 \mid \tau_{1N} \tau_{2Y}) \sigma_4$        (**) 

We will show that the corresponding regular event (the sequences defined 
by this regular expression) cannot be defined by any of the regular 
expressions in $\mathcal{R}$. 

Every regular expression in $\mathcal{R}$ which specifies infinitely many 
sequences includes some test $\tau$ with one of the following two properties: 

(i) Every occurrence of $\tau_Y$ is followed by at least one occurrence 
of $\tau_N$. 

or (ii) Every occurrence of $\tau_N$ is followed by at least one occurrence 
of $\tau_Y$. 

The infinitely many sequences specified by (**) do not have any 
such test since the sequences include 

$\sigma_1 \tau_{1Y} \sigma_3 \sigma_4$,   $\sigma_1 \tau_{1N} \tau_{2Y} \sigma_4$,   $\sigma_1 \tau_{1N} \tau_{2N} \sigma_2 \tau_{1Y} \sigma_3 \sigma_4$. 

Hence no regular expression in $\mathcal{R}$ can produce the regular event (**), 
and the theorem is proved. 
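
The key step of this proof can be checked mechanically on a finite sample. The following is a minimal Python sketch under stated assumptions: the symbol names "s1", "t1Y", "t2N", and so on are hypothetical encodings of the statements and test outcomes in (**), and only sequences with a bounded number of loop iterations are generated. Because properties (i) and (ii) are universal statements over all sequences, a single counterexample in the sample is enough to refute them.

```python
def sequences_star_star(max_iterations):
    """Sequences of (**) whose loop body runs fewer than max_iterations times."""
    seqs = []
    for j in range(max_iterations):
        loop = ["t1N", "t2N", "s2"] * j
        for tail in (["t1Y", "s3"], ["t1N", "t2Y"]):
            seqs.append(["s1"] + loop + tail + ["s4"])
    return seqs


def always_followed(seqs, first, later):
    """True if every occurrence of `first` in every sequence has a `later` after it."""
    return all(
        later in seq[pos + 1:]
        for seq in seqs
        for pos, sym in enumerate(seq)
        if sym == first
    )


def has_property_i_or_ii(seqs, test):
    """Check property (i) or (ii) of the text for the test named `test`."""
    yes, no = test + "Y", test + "N"
    return always_followed(seqs, yes, no) or always_followed(seqs, no, yes)
```

For contrast, the sequences of an ordinary while loop, "s1" followed by any number of "t1N", "s2" pairs and then "t1Y", "s4", do satisfy property (ii) for the test t1, whereas neither t1 nor t2 satisfies either property on the sample drawn from (**).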




Perhaps the reader feels that the above proof is too "slick", or 
that something has been concealed. In fact, this is quite true; we 
have penalized the class of flowcharts too severely! Compound tests 
built from simpler tests $\tau'$ and $\tau''$ have not been allowed 
sufficient latitude. Our flowchart grammar should be extended as 
follows: Replace the test box 

$\tau$  (with its YES and NO exits) 

in the definitions of <conditional statement> and <iterative statement> 
by 

<condition>  (with YES and NO exits) 

and add the new definition 

[diagram defining <condition>; not recoverable] 

The grammar now becomes ambiguous in several cases, although the ambiguity 
can be removed at the expense of some complications which are irrelevant 
here. More important is the change to grammar $\mathcal{R}$, where we are allowed 
to substitute 

$\tau'_Y$ for $\tau_N$ and $\tau'_N$ for $\tau_Y$, 

or 

$\tau'_N \tau''_N$ for $\tau_N$ and $(\tau'_Y \mid \tau'_N \tau''_Y)$ for $\tau_Y$, 

whenever $\tau$, $\tau'$, and $\tau''$ are tests. Thus since 
$\sigma_1 (\tau_N \sigma_2)^* \tau_Y \sigma_4 \in \mathcal{R}$, so is 

$\sigma_1 (\tau_{1N} \tau_{2N} \sigma_2)^* (\tau_{1Y} \mid \tau_{1N} \tau_{2Y}) \sigma_4$ 

and this is the same as (**) with $\sigma_3$ deleted. The theorem above is 
almost false! But we can still prove it by an exhaustive case analysis, 
considering all possible substitutions of compound tests and showing 
that none are permissible because of the presence of $\sigma_3$. 


The theorem becomes almost false in another sense too, when compound 
conditions are considered, since the expression 

$\sigma_1 (\tau_{1N} \tau_{2N} \sigma_2)^* (\tau_{1Y} \mid \tau_{1N} \tau_{2Y}) (\tau_{1Y} \sigma_3 \mid \tau_{1N}) \sigma_4$ 

is in $\mathcal{R}$ and it differs from (**) only in that $\tau_{1Y}$ becomes 
$\tau_{1Y} \tau_{1Y}$ and $\tau_{1N} \tau_{2Y}$ becomes $\tau_{1N} \tau_{2Y} \tau_{1N}$. 
The sequences are essentially the same 
except that redundant tests are made. We could therefore consider 
equivalence operations on regular expressions, allowing commutativity 
of successive tests, and an idempotent law $\tau_Y \tau_Y = \tau_Y$. In that case 
our theorem would become false; but we can easily find another flowchart 
for which the theorem still applies: Simply put another statement box 
$\sigma_5$ between $\tau_1$ and $\tau_2$. Then no two tests are adjacent, and our original 
"slick" proof immediately shows that the regular event defined by 

$\sigma_1 (\tau_{1N} \sigma_5 \tau_{2N} \sigma_2)^* (\tau_{1Y} \sigma_3 \mid \tau_{1N} \sigma_5 \tau_{2Y}) \sigma_4$ 

is not equivalent to any regular event definable with $\mathcal{R}$. (When no 
two tests are adjacent compound conditions cannot appear, nor do any of 
the equivalences apply, so none of the extensions affect the original 
proof of the theorem.) 

Therefore our "slick" proof is vindicated, and we have proved the 
existence of programs whose go to statements cannot be eliminated 
without introducing procedure calls . 

Let us now consider a second example program, taken this time from 
a typical "backtracking" or exhaustive enumeration application. Most 
backtrack problems can be abstracted into the following form: 

start: m[1] := 0; k := 0; 
up:    k := k+1; list(k); a[k] := m[k]; 
try:   if a[k] < m[k+1] then begin move(a[k]); go to up end; 
down:  k := k-1; 
       if k = 0 then go to done; 
       unmove(a[k]); 
       a[k] := a[k]+1; go to try; 
done: 

Here the procedures list, move, unmove may be regarded as manipulating 
a variable-width stack s[0], s[1], ... of possible choices in this 
abstracted algorithm. Procedure list(k) determines all possible choices 
at the k-th level of backtracking, based on the previously made choices 
a[1], ..., a[k-1]. If there are c choices now possible, list(k) will 
set m[k+1] := m[k]+c, and it will also set the stack entries 
s[m[k]+1], ..., s[m[k]+c] to identify the choices. (Note that c can 
be zero. The choices might be, for example, where to place the k-th 
queen on a chessboard, given positions of k-1 other queens, if we are 
trying to solve the queens' problem.) Procedure move(t) makes the 
decision to choose alternative s[t]; this usually means that some 
internal tables need to be updated. Procedure unmove(t) reverses the 
decisions made by move(t). 

It is not necessary to understand the exact mechanism of this 
construction, although people familiar with backtracking should find 
the previous paragraph self-explanatory; the main point is that essentially 
all backtracking programs have the form of the above program, when 
appropriate sequences of code are substituted for list(k), move(a[k]), 
and unmove(a[k]); hence the program is worth considering from the 
standpoint of go-to elimination. 

First we can eliminate go-to's by introducing a procedure: 

procedure backtrack(k); value k; integer k; 
begin list(k); a[k] := m[k]; 
    while a[k] < m[k+1] do 
        begin move(a[k]); backtrack(k+1); unmove(a[k]); 
            a[k] := a[k]+1 
        end 
end backtrack; 
m[1] := 0; backtrack(1); 

This use of recursion is rather clean, so the above program is attractive 
except for the procedure-calling overhead (which is important since 
backtrack programs typically involve many millions of iterations). 
It is an interesting exercise to prove this program equivalent to our 
first version. 

Now let's try to eliminate the go to statements without introducing 
a new procedure. The flowchart is: 

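[Flowchart; its edges are: 
START -> $\sigma_1$ -> $\sigma_2$ -> $\tau_1$; 
$\tau_1$ YES -> $\sigma_3$ -> $\sigma_2$; 
$\tau_1$ NO -> $\sigma_4$ -> $\tau_2$; 
$\tau_2$ YES -> STOP; 
$\tau_2$ NO -> $\sigma_5$ -> $\tau_1$.] 

$\sigma_1$ = m[1] := 0; k := 0 
$\sigma_2$ = k := k+1; list(k); a[k] := m[k] 
$\tau_1$ = a[k] < m[k+1] 
$\sigma_3$ = move(a[k]) 
$\sigma_4$ = k := k-1 
$\tau_2$ = k = 0 
$\sigma_5$ = unmove(a[k]); a[k] := a[k]+1 
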

Here we have the basic flowchart structure 

[diagram not recoverable] 

instead of the previous situation when we had 

[diagram not recoverable] 

It turns out that node-splitting works in this case but not the other; 
we can make two copies of node $\sigma_2$ in the above flowchart and we 
obtain 

[Flowchart from START to STOP with node $\sigma_2$ duplicated; diagram 
not recoverable] 

This diagram obviously satisfies the conditions of our flowchart grammar 
above, so we can avoid the go to statements. 

What is the resulting program? Our flowchart grammar above allows 
more general iterative statements than present-day programming languages 
will admit. A general iterative construction might be written 

begin loop $\sigma_1$; exit loop if $\tau_1$; $\sigma_2$ end loop;        (***) 

but today's languages only consider the case that $\sigma_1$ is empty: 

while $\neg\tau_1$ do $\sigma_2$; 

or if $\sigma_2$ is empty: 

do $\sigma_1$ until $\tau_1$; 

We can always rewrite (***) in the equivalent form 

$\sigma_1$; while $\neg\tau_1$ do begin $\sigma_2$; $\sigma_1$ end; 

but this is quite unattractive when $\sigma_1$ is long, so a programmer will 
certainly prefer to use go to statements in that case. If we want to 
teach programmers to avoid go to statements, we must provide them with 
a sufficiently rich syntax of iterative statements to serve as a 
substitute. 
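
The two forms of (***) can be set side by side in a language that has a mid-loop exit. The following is a minimal Python sketch, not taken from the original note: the function names are hypothetical, the statement before the exit is "fetch the next item", the test is "the item is negative", and the statement after the exit is "add the item to a total". It assumes the input contains a negative sentinel.

```python
def loop_with_exit(items):
    """begin loop s1; exit loop if t1; s2 end loop, written with `break`."""
    it = iter(items)
    total = 0
    while True:
        x = next(it)        # first statement: fetch the next item
        if x < 0:           # test: negative sentinel reached
            break
        total += x          # second statement: accumulate
    return total


def loop_duplicated(items):
    """The equivalent form: s1; while not t1 do begin s2; s1 end."""
    it = iter(items)
    total = 0
    x = next(it)            # first statement, written once before the loop
    while not x < 0:
        total += x          # second statement
        x = next(it)        # first statement, duplicated inside the loop
    return total
```

Both functions compute the same total; the second has to write the fetch twice, which is the duplication the text calls unattractive when that statement is long.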

Using (***) leads to the following program for backtracking without 
go to statements: 

m[1] := 0; k := 1; list(1); a[1] := 0; 
begin loop 
    while a[k] < m[k+1] do 
        begin move(a[k]); 
            k := k+1; list(k); a[k] := m[k] 
        end; 
    k := k-1; 
    exit loop if k = 0; 
    unmove(a[k]); a[k] := a[k]+1 
end loop; 

This code, although free of "go to" statements, involves an uncomfortable 
element which may not make it very palatable: the "while a[k] < m[k+1]" 
is a rather peculiar condition since k varies and the test involves 
different variables each time. This is quite different in effect from 
the appearance of the same clause in our recursive procedure backtrack(k). 
It is possible to think of the program in a fairly natural way nevertheless, 
for example (in tree language) as follows: 

start at root of search tree; 
begin loop 

while possible to go down and left in tree do so; 
move up one level in the tree; 
exit loop if at the root; 

move to the right in the tree; 
end loop ; 

This is a typical tree traversal algorithm. Yet it is debatable whether 
or not the elimination of go to statements was an improvement. 
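
The recursive and the loop-with-exit forms of the backtrack program can be run against each other. The following is a minimal Python sketch, not the authors' code: the class Queens and its methods are hypothetical stand-ins for list, move, and unmove, specialized to the queens' problem. It assumes the choices at level k occupy s[m[k]] through s[m[k+1]-1], which is what the loop bounds a[k] := m[k] and a[k] < m[k+1] imply, and it counts a solution when list is called one level past the last row.

```python
class Queens:
    """Hypothetical list/move/unmove for n queens; m and a are 1-based."""

    def __init__(self, n):
        self.n = n
        self.s = []               # stack of candidate columns
        self.m = [0, 0]           # m[1] = 0; m[0] is unused padding
        self.a = [0] * (n + 2)    # a[k] indexes the current choice at level k
        self.placed = []          # columns of the queens placed so far
        self.solutions = 0

    def list_choices(self, k):
        """list(k): push the choices for level k and set m[k+1] = m[k] + c."""
        del self.s[self.m[k]:]
        del self.m[k + 1:]
        if k > self.n:
            self.solutions += 1   # every row has a queen; c = 0 here
        else:
            row = len(self.placed)
            for col in range(self.n):
                if all(col != c and abs(col - c) != row - r
                       for r, c in enumerate(self.placed)):
                    self.s.append(col)
        self.m.append(len(self.s))

    def move(self, t):
        self.placed.append(self.s[t])

    def unmove(self, t):
        self.placed.pop()


def backtrack_recursive(p, k=1):
    """The recursive procedure backtrack(k)."""
    p.list_choices(k)
    p.a[k] = p.m[k]
    while p.a[k] < p.m[k + 1]:
        p.move(p.a[k])
        backtrack_recursive(p, k + 1)
        p.unmove(p.a[k])
        p.a[k] += 1


def backtrack_loop(p):
    """The go-to-free program built on the loop with a mid-loop exit."""
    k = 1
    p.list_choices(1)
    p.a[1] = p.m[1]
    while True:
        while p.a[k] < p.m[k + 1]:
            p.move(p.a[k])
            k += 1
            p.list_choices(k)
            p.a[k] = p.m[k]
        k -= 1
        if k == 0:
            break
        p.unmove(p.a[k])
        p.a[k] += 1


def count_solutions(n, driver):
    p = Queens(n)
    driver(p)
    return p.solutions
```

Calling count_solutions(n, backtrack_recursive) and count_solutions(n, backtrack_loop) for the same n should agree, since both drivers issue the same sequence of list, move, and unmove calls.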

The syntax in (***) is perhaps not the best way to improve 
iteration statements. An alternative proposal, based on some unpublished 
ideas of Wirth, has just been implemented as an extension to Stanford’s 
ALGOL W compiler: The statement 

repeat <block> 

has the effect of 

L1: <block>; go to L1; L2: 

and the statement 

exit 

has the effect of 

go to L2 

where L2 is the second implicit label corresponding to the smallest 
repeat block statically enclosing the exit statement. Thus, (***) 
becomes 

repeat begin $\sigma_1$; if $\tau_1$ then exit; $\sigma_2$ end; 

and we can even write our symbol table search routine without go to 
statements: 

i := 1; 
repeat begin 
    while i ≤ n do if A[i] = x then exit else i := i+1; 
    n := i; A[i] := x; B[i] := 0; exit 
end; 
B[i] := B[i]+1; 

Here the "repeat loop" is never repeated, but the desired effect has 
been achieved. It appears doubtful that this repeat-exit mechanism 
will be able to eliminate go to statements in general, since it only 
allows a "one-level exit"; further study of these issues is indicated. 
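
A comparable one-level exit exists in Python, whose break statement leaves only the innermost loop. The following is a minimal sketch of the symbol table search in that style. It is not the ALGOL W extension: it swaps in Python's else clause on a while loop, which runs only when the loop ends without break. The function name is hypothetical, and the sketch assumes A and B are lists holding exactly n+1 entries with index 0 unused.

```python
def search_while_else(A, B, n, x):
    """Find x in A[1..n] or append it, then bump its count; returns the new n."""
    i = 1
    while i <= n:
        if A[i] == x:
            break           # plays the role of `exit`: skip the insertion
        i += 1
    else:                   # runs only when the loop ended without `break`
        n = i
        A.append(x)         # lands at index n because len(A) was n before
        B.append(0)
    B[i] += 1
    return n
```

As with the repeat block in the text, no loop is used for repetition beyond the search itself; the else clause only supplies the "not found" exit path.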





